Make partial aggregation adaptive by lukasz-stec · Pull Request #11011 · trinodb/trino

lukasz-stec · 2022-02-10T21:09:21Z

Description

This is an optimization for the HashAggregationOperator when the aggregation is split into partial and final steps.
When the partial aggregation step does not reduce the number of rows much (e.g. 90% of rows are unique), it brings only a small benefit in network savings but costs a lot of CPU.
In this case, it would be beneficial to skip partial aggregation altogether at planning time,
but this is not easy to do because we don't always have reliable statistics for the number of unique values, especially in the intermediate query stages.
Instead (and complementary to any planner changes), this PR adds a simple runtime adaptation for the partial aggregation step: it sends raw, ungrouped rows to the final step if the ratio of unique rows to input rows is high enough (0.8 by default).
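
The decision rule described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class name `AdaptivePartialAggregationSketch`, the `Controller` type, the `minInputRows` warm-up parameter, the strict `>` comparison, and the choice to never re-enable aggregation are all assumptions made for illustration. Only the idea of comparing the unique-to-input row ratio against a threshold (0.8 by default) comes from the description.

```java
import java.util.HashSet;
import java.util.List;

public class AdaptivePartialAggregationSketch {
    /** Hypothetical controller: tracks row counts and decides when to stop aggregating. */
    static final class Controller {
        final double uniqueRowsRatioThreshold; // 0.8 by default, per the PR description
        final long minInputRows;               // assumed warm-up, not taken from the PR
        long inputRows;
        long uniqueRows;
        boolean partialAggregationDisabled;

        Controller(double uniqueRowsRatioThreshold, long minInputRows) {
            this.uniqueRowsRatioThreshold = uniqueRowsRatioThreshold;
            this.minInputRows = minInputRows;
        }

        /** Called after the partial step flushes: records rows consumed and groups produced. */
        void onFlush(long flushedInputRows, long flushedUniqueRows) {
            inputRows += flushedInputRows;
            uniqueRows += flushedUniqueRows;
            // Disable once enough rows were seen and grouping barely reduced them.
            // In this sketch the decision is sticky: it is never re-enabled.
            if (inputRows >= minInputRows
                    && (double) uniqueRows / inputRows > uniqueRowsRatioThreshold) {
                partialAggregationDisabled = true;
            }
        }

        boolean isPartialAggregationDisabled() {
            return partialAggregationDisabled;
        }
    }

    /** Stand-in for the group-by hash: counts distinct grouping keys in a batch. */
    static long countDistinct(List<String> keys) {
        return new HashSet<>(keys).size();
    }

    public static void main(String[] args) {
        Controller controller = new Controller(0.8, 10);
        // 10 input rows, 9 distinct keys: ratio 0.9 exceeds the 0.8 threshold.
        List<String> batch = List.of("a", "b", "c", "d", "e", "f", "g", "h", "i", "a");
        controller.onFlush(batch.size(), countDistinct(batch));
        System.out.println(controller.isPartialAggregationDisabled()); // prints "true"
    }
}
```

Once the controller reports that partial aggregation is disabled, an operator following this scheme would pass subsequent input rows through ungrouped, still in the intermediate (accumulator state) layout that the final step expects.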

With this change, there is still significant overhead in the partial step, mainly in the PartitionedOutputOperator, which has to handle the superfluous accumulator state for the raw rows, and in the HashAggregationOperator, which needs to create and populate this state.
There are potential improvements in both HashAggregationOperator and PartitionedOutputOperator that would limit this overhead.
Another possible approach is to have a separate pipe (as it has a different layout) from the partial to the final step that carries only the input pages, without the accumulator state. This would eliminate almost all of the overhead but would require larger changes in the core engine.

tpch/tpcds benchmark results for orc sf1000

part

overall ~6% improvement for tpch and 1.5% for tpcds. Most queries are not affected; some gain between 10% and 35%
adaptive-pa-part-nocode.pdf

unpart

overall 3.5% for tpch and 2.5% for tpcds
adaptive-pa-unpart-nocode.pdf

General information

Is this change a fix, improvement, new feature, refactoring, or other?

performance improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

core query engine (HashAggregationOperator)

How would you describe this change to a non-technical end user or system administrator?

Improves GROUP BY performance by skipping the partial aggregation step when it does not reduce the number of rows enough

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

Improve performance of GROUP BY with a large number of groups.

sopel39 merged 4 commits into trinodb:master from starburstdata:ls/adaptive-pa
